Hierarchical Brushing of High-Dimensional Data Sets Using Quality Metrics
Abstract
In this paper, we present an interactive exploration framework that keeps the human in the loop, combining quality metrics with brushing techniques for efficient visual analysis of high-dimensional data sets. Our approach exploits the human ability to distinguish interesting structures even in very cluttered projections of the data, and uses quality metrics to guide the user toward promising projections that would otherwise be difficult or time-consuming to find. Brushing the data creates new subsets, which are ranked again using quality metrics and recursively analyzed by the user. The combination of hierarchical brushing and quality metrics thus supports interactive exploratory analysis of high-dimensional data sets. We apply our approach to synthetic and real data sets to demonstrate its usefulness.
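The rank, brush, and re-rank loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the quality metric is simply the absolute Pearson correlation of each axis-aligned 2D projection, the brush is a rectangle, and the names `pearson`, `rank_projections`, and `brush` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def rank_projections(rows):
    """Score every axis-aligned 2D projection (i, j) by |r|, best first.

    |r| is a stand-in quality metric chosen for this sketch only.
    """
    if not rows:
        return []
    dims = len(rows[0])
    scored = []
    for i, j in combinations(range(dims), 2):
        xs = [r[i] for r in rows]
        ys = [r[j] for r in rows]
        scored.append((abs(pearson(xs, ys)), (i, j)))
    scored.sort(key=lambda item: -item[0])
    return scored


def brush(rows, dim_x, dim_y, x_range, y_range):
    """Rectangular brush: keep rows inside the box in the (dim_x, dim_y) view."""
    (x0, x1), (y0, y1) = x_range, y_range
    return [r for r in rows if x0 <= r[dim_x] <= x1 and y0 <= r[dim_y] <= y1]


# Toy data (hypothetical): dimension 0 labels two groups; dimensions 1 and 2
# rise together in one group and move in opposite directions in the other,
# so the trend cancels out when the whole data set is scored.
rows = [(0.0, float(k), float(k)) for k in range(10)]
rows += [(1.0, float(k), float(9 - k)) for k in range(10)]

top_level = rank_projections(rows)

# The user brushes the left group in the (0, 1) view; the subset is re-ranked.
subset = brush(rows, 0, 1, (-0.5, 0.5), (0.0, 9.0))
second_level = rank_projections(subset)
```

In this toy data, no projection scores above zero on the full data set because the two groups cancel each other, whereas re-ranking the brushed subset puts the (1, 2) projection first. A real system would substitute its own quality metrics and brush shapes; the sketch only shows the recursive structure.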
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Structure-Based Brushes: A Mechanism for Navigating Hierarchically Organized Data and Information Spaces
Interactive selection is a critical component in exploratory visualization, allowing users to isolate subsets of the displayed information for highlighting, deleting, analysis, or focussed investigation. Brushing, a popular method for implementing the selection process, has traditionally been performed in either screen space or data space. In this paper, we introduce an alternate, and potential...
Markov Chain Driven Multi-Dimensional Visual Pattern Analysis with Parallel Coordinates
Parallel coordinates is a widely used visualization technique for presenting, analyzing and exploring multidimensional data. However, like many other visualizations, it can suffer from an overplotting problem when rendering large data sets. To date, quite a few methods have been proposed to discover and illustrate the major data trends in cluttered parallel coordinates. Among them, frequency-based ...